Anand Venkataraman

Techniques for effective vocabulary selection

Jun 04, 2003

Anand Venkataraman, Wen Wang

Abstract: The vocabulary of a continuous speech recognition (CSR) system is a significant factor in determining its performance. In this paper, we present three principled approaches to select the target vocabulary for a particular domain by trading off between the target out-of-vocabulary (OOV) rate and vocabulary size. We evaluate these approaches against an ad-hoc baseline strategy. Results are presented in the form of OOV rate graphs plotted against increasing vocabulary size for each technique.

* 4 pages. To appear Proc. Eurospeech 2003, Geneva
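
To make the trade-off described in the abstract concrete, here is a minimal Python sketch of an OOV-rate-versus-vocabulary-size curve. It assumes a simple frequency-ranked selection (the top-N most frequent training words), which is only a stand-in: the abstract does not specify the baseline or the three proposed approaches. The names `oov_rate` and `oov_curve` and the toy corpora are hypothetical.

```python
from collections import Counter

def oov_rate(vocab, heldout_tokens):
    """Fraction of held-out tokens that are not in the vocabulary."""
    if not heldout_tokens:
        return 0.0
    misses = sum(1 for tok in heldout_tokens if tok not in vocab)
    return misses / len(heldout_tokens)

def oov_curve(train_tokens, heldout_tokens, sizes):
    """OOV rate on held-out data for the top-N most frequent training words.

    Assumption: frequency ranking is used purely for illustration;
    it is not taken from the paper.
    """
    ranked = [word for word, _ in Counter(train_tokens).most_common()]
    return [(n, oov_rate(set(ranked[:n]), heldout_tokens)) for n in sizes]

# Toy data, invented for this example.
train = "the cat sat on the mat the dog sat on the log".split()
heldout = "the dog sat on the mat with a cat".split()

curve = oov_curve(train, heldout, sizes=[1, 2, 4, 7])
for size, rate in curve:
    print(f"vocab size {size}: OOV rate {rate:.2f}")
```

Under this ranking the OOV rate can only stay flat or fall as the vocabulary grows, so the curve shows how much coverage each additional block of words buys.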

A Statistical Model for Word Discovery in Transcribed Speech

Nov 30, 2001

Anand Venkataraman

Abstract: A statistical model for segmentation and word discovery in continuous speech is presented. An incremental unsupervised learning algorithm to infer word boundaries based on this model is described. Results of empirical tests showing that the algorithm is competitive with other models that have been used for similar tasks are also presented.

* Computational Linguistics, 27(3), pp.352--372, 2001
* Expanded version of ICML-01 paper (pp.569--576)

A procedure for unsupervised lexicon learning

Nov 30, 2001

Anand Venkataraman

Abstract: We describe an incremental unsupervised procedure to learn words from transcribed continuous speech. The algorithm is based on a conservative and traditional statistical model, and results of empirical tests show that it is competitive with other algorithms that have been proposed recently for this task.

* Proceedings of the Eighteenth International Conference on Machine Learning, ICML-01, pp.569--576, 2001
* Expanded version of this paper appears in Computational Linguistics 27(3)

MAP Lexicon is useful for segmentation and word discovery in child-directed speech

Oct 14, 1999

Anand Venkataraman

Abstract: Because of rather fundamental changes to the underlying model proposed in the paper, it has been withdrawn from the archive.

A statistical model for word discovery in child directed speech

Oct 13, 1999

Anand Venkataraman

Abstract: A statistical model for segmentation and word discovery in child directed speech is presented. An incremental unsupervised learning algorithm to infer word boundaries based on this model is described. Results of empirical tests showing that the algorithm is competitive with other models that have been used for similar tasks are also presented.

* 48 pages, 10 figures
